LtU Forum, Site Discussion

Filter-Reduce Optimization

This must be well-know amongst functional programming researchers, but I kind of stumbled upon it by accident when studying Joy.

Consider the following program to count even numbers:

define count_evens = [2 mod 0 neq] filter 0 [inc] reduce

So for those new to Joy, every program/function takes it parameter on the stack, and pushes its results onto the stack.

Here is a brief explanation of the operators:

2 = push the value two onto the stack
[] = push the contents onto the stack without evaluating them
mod = modulo
inc = increment the top item of the stack
neq = not equals

The program I showed takes a list on the stack, filters the list for only even numbers, and then counts them using a reduce function.

Assuming you are following me so far, this program can be rewritten as follows (caveat: I'm doing this off the top of my head so I could have made a mistake in syntax):

  def count_evens_1 = 0 [2 mod 0 eq [inc] [] if] reduce

What I've done here is eliminate the unneccessary filter operation and "folded" it (pun intended) into the reduce operation.

My observation when working with Joy is that this optimization is trivial to automate.

So my question to the group is: is this a well known optimization, and if so, is it as easily applied in other functional languages?

For those whose curiosity has been piqued, I'm researching the viability of using a Joy dialect as an intermediate language.

Patterns of Integer Usage

A lot of embedded systems programmers use C and C++ to develop very large and complex pieces of software. Often these people are never introduced to higher-level constructs or improvements to languages to make that software more reliable. I recently wrote an article that attempts to introduce some ideas from different languages to show their benefits. The article is available as:

Patterns of Integer Usage

The article was written for those who rarely look past C/C++/C# as implementation languages (certainly not this crowd). However, I would appreciate any criticisms or critiques this community may offer.

Natural Language Programming for Interactive Fiction

After years of effort, Graham Nelson has released version 7 of the Inform language. Inform is the language created by Infocom for the construction of interactive fiction games, such as the legendary Zork series. This latest edition of the language is notable in that it is based on a subset of English, and reads like natural-language descriptions. For example:

The LNER Mallard is a steam locomotive. The Mallard is 4-6-2.
A steam locomotive can be watered or unwatered. A locomotive is usually watered.
A man called Peter is in the Atrium. North of the Atrium is the Hall of Greek Vases.
Brightness is a kind of value. The brightnesses are guttering, weak, radiant and blazing.

These declarations create not only the terms used, but the relations between the terms. Conditions can later be tested using the domain-specific terms thus created (if the cat is on the mat...).

how can PLT experts help improve the web?

Recently many javascript libraries have started molding javascript to become more functional: Prototype provides map/filter functions for javascript arrays. jquery is providing 'chained' methods where every method of an object returns the updated object.

Phill Wadler is bringing us Links, a whole programming language for the web.

Automatic Generation of Intelligent JavaScript Programs for Handling Input Forms in HTML Documents by Tetsuya Suzuki and Takehiro Tokuda is an interesting paper about using constraint solving for web forms.

Is there any other such work being done to imporve actual web applications built every day. Are there other ideas which could help developers and users?

What do you believe about Programming Languages (that you can't prove (yet))?

The Edge "World Question Center asks the thought provoking, but overly general for this forum, question "WHAT DO YOU BELIEVE IS TRUE EVEN THOUGH YOU CANNOT PROVE IT?"

So let's tailor it a little for LtU...

What do you believe about Programming Languages (that you can't prove (yet))?

Folding neither Left nor Right (or how to avoid overspecifying the solution to a problem)

Can anyone tell me which functional languages support a non-order-specific fold, and what the name of those operations are? I read somewhere that sometimes reduce is non-order specific, but other places claims it is the same as foldl. Clearly it depends on the languages, but I don't trust those source, so I thought I'd ask my favourite group of egg-heads. :-)

It seems that specifying left or right folds when the function being used is associative is over-specifying the solution to a problem. In other words you are solving a more specific problem than stated. Commonplace in imperative code, but it should be more easily avoided in functional code. Perhaps this is moot because it might be an "easy" problem for compilers to figure out if a "clean" function (e.g. pure functional) is associative and/or commutative based on the primitives.

I am studying the problem of optimizing / parallelizing pure functional code. So any good pointers to basic primers on the internet would be much appreciated. I am particularly interested in those common-knowledge optimizations of functional code. I don't have time to purchase any books, so online references would be most appreciated. It would be cool to gather a compendium of functional optimization tips and tricks on this thread, but that might be hoping for too much.

Thanks in advance.

First-Class Traces

Some development environments provide a tracer, which allows you to observe the entering and exiting of function calls[1].

E.g. in Common Lisp you can do:

(defun flatten (list)
  (cond
    ((null list) list)
    ((atom list) (list list))
    (t (append (flatten (first list))
               (flatten (rest list))))))
(trace flatten)
(flatten '(a (b c) d))

which could output

0 FLATTEN > ...
  >> LIST : (A (B C) D)
  1 FLATTEN > ...
    >> LIST : A
  1 FLATTEN < ...
    << VALUE-0 : (A)
  1 FLATTEN > ...
    >> LIST : ((B C) D)
    2 FLATTEN > ...
      >> LIST : (B C)
      3 FLATTEN > ...
        >> LIST : B
      3 FLATTEN < ...
        << VALUE-0 : (B)
      3 FLATTEN > ...
        >> LIST : (C)
        4 FLATTEN > ...
          >> LIST : C
        4 FLATTEN < ...
          << VALUE-0 : (C)
        4 FLATTEN > ...
          >> LIST : NIL
        4 FLATTEN < ...
          << VALUE-0 : NIL
      3 FLATTEN < ...
        << VALUE-0 : (C)
    2 FLATTEN < ...
      << VALUE-0 : (B C)
    2 FLATTEN > ...
      >> LIST : (D)
      3 FLATTEN > ...
        >> LIST : D
      3 FLATTEN < ...
        << VALUE-0 : (D)
      3 FLATTEN > ...
        >> LIST : NIL
      3 FLATTEN < ...
        << VALUE-0 : NIL
    2 FLATTEN < ...
      << VALUE-0 : (D)
  1 FLATTEN < ...
    << VALUE-0 : (B C D)
0 FLATTEN < ...
  << VALUE-0 : (A B C D)
(A B C D)

As you can see, the macro TRACE only enables printing the trace. Printing a trace without having a way to access it directly/programmatically seems rather premature to me. It is evident from the above trace that traces form a recursive tree-like data type, that however is not accessible to the programmer. The entering and the exiting values from a function belong to one node of the tree (even though they are printed far apart), and the subtraces are subtrees.

Thus, continuing the above example, such a `trace' data structure might roughly resemble

(FLATTEN (A (B C) D) (A B C D)
  (FLATTEN A (A))
  (FLATTEN ((B C) D) (B C D)
    (FLATTEN (B C) (B C)
      (FLATTEN B (B))
      (FLATTEN (C) (C)
        (FLATTEN C (C))  ; X
        (FLATTEN NIL NIL)))
    (FLATTEN (D) (D)
      (FLATTEN D (D))
      (FLATTEN NIL NIL))))

Since I could not find anything about this, I was wondering the following: are there any languages (or easy ways of implementing it yourself) that can reify traces as data structures such as above that you can manipulate? In case you are wondering, there are potentially many reasons why this might be useful. E.g. one might like to compare two traces, or store a trace for later, or even consider the trace itself as a kind of higher-level program (although I have not investigated this idea in depth).

Note that I am not referring to `stack traces' (a.k.a. `backtraces'). The stack trace at the execution of location X above might be a structure resembling

(FLATTEN (A (B C) D)
  (FLATTEN ((B C) D)
    (FLATTEN (B C)
      (FLATTEN (C)
        (FLATTEN C)))))  ; X

Debuggers usually already provide many ways of manipulating the stack trace after e.g. a exception or breakpoint (although I do not know which ones are able to reify it...). Also, as I recall, one simple way of implementing continuations is copying the stack (which is part of the continuation) to the heap. So stack traces are not the issue here, nor are the `traces' as the word is sometimes used in theoretical computer science (sequences of successive states of some state transition system).

[1] It is actually an interesting question what it would mean to trace operators other than functions, but I did not go into that here.

Tom 2.3 Released

Tom is a pattern matching compiler developed at INRIA. It is
particularly well-suited for programming various transformations on
trees/terms and XML based documents. Its design follows our research
on the efficient compilation of rule based languages (e.g. ELAN,
developed at INRIA-Loria).

This release continues our work on the integration of pattern matching
and rule based programming facilities into C and Java.

Many applications have been developed in Tom. Among them, lets mention:

  • the Tom compiler itself
  • languages semantics, interpreters and program transformation tools
  • a prover for the Calculus of Structures
  • an interpreter for the Rho Calculus
  • a disunification algorithm
  • Tom is a complex compiler which adds powerful constructs to C and
    Java: non linear syntactic matching, associative matching with neutral
    element (a.k.a. list-matching), XML based pattern matching, string
    matching, and equational rewriting.
    This offers the possibility to analyze and transform any kind of
    data-structure. Tom can be used for large scale developments and
    applications. It comes with documentation, programming, and debugging support.

    This new release contains many improvements and new features:
    - a new generator of abstract data types implementations (Gom) which supports hooks. In practice, this corresponds to private data types of Caml, which ensures that terms are maintained in canonical form

    - a new %strategy construct which allows to easilly define strategies that can be combined using strategy primitives a la Stratego (All, One, Repeat, Choice, Innermost, Mu, etc.)

    - a new %[...]% construct which helps to write cide generators (it is no longer necessary to encode special characters of strings)

    Tom is available, in open source (GPL/BSD License), from the Tom web page

    Implementation of Hecl

    The Hecl scripting language as a programming language isn't something that's pushing programming language design in interesting new directions. However, as a simple, dynamic language that's implemented in Java, it's pretty easy to figure out how it works, so perhaps this article on its design and implementation is of interest to those who aren't quite as advanced or passionate as most of the LtU readership, or who are interested in something to sink their teeth into before wading into something more difficult.

    Additionally, because of the very fact that it is small and simple, it has a practical application: the runtime runs on even older, slower J2ME-enabled cell phones.

    I originally wrote the article for a European Linux magazine, but didn't like their terms, so I decided to just put it up on my web site:

    http://www.dedasys.com/articles/hecl_implementation.html

    Chuck - Concurrent audio programming language

    Not sure if this is relavent to LtU but it is a programming language so...

    http://chuck.cs.princeton.edu/

    ChucK is a new audio programming language for real-time synthesis, composition, and performance - fully supported on MacOS X, Windows, and Linux. ChucK presents a new time-based concurrent programming model, which supports a more precise and fundamental level of expressiveness, as well as multiple, simultaneous, dynamic control rates, a precise and straightforward concurrency, and the ability to add, remove, and modify code, on-the-fly, while the program is running, without stopping or restarting.

    XML feed